GHN-Q: Parameter Prediction for Unseen Quantized Convolutional Architectures via Graph Hypernetworks
Deep convolutional neural network (CNN) training via iterative optimization
has had incredible success in finding optimal parameters. However, modern CNN
architectures often contain millions of parameters. Thus, any given model for a
single architecture resides in a massive parameter space. Models with similar
loss could have drastically different characteristics such as adversarial
robustness, generalizability, and quantization robustness. For deep learning on
the edge, quantization robustness is often crucial. Finding a model that is
quantization-robust can sometimes require significant efforts. Recent works
using Graph Hypernetworks (GHN) have shown remarkable performance predicting
high-performant parameters of varying CNN architectures. Inspired by these
successes, we wonder if the graph representations of GHN-2 can be leveraged to
predict quantization-robust parameters as well, which we call GHN-Q. We conduct
the first-ever study exploring the use of graph hypernetworks for predicting
parameters of unseen quantized CNN architectures. We focus on a reduced CNN
search space and find that GHN-Q can in fact predict quantization-robust
parameters for various 8-bit quantized CNNs. Decent quantized accuracies are
observed even with 4-bit quantization despite GHN-Q not being trained on it.
Quantized finetuning of GHN-Q at lower bitwidths may bring further improvements
and is currently being explored.
Comment: Updated Figure 1 and added additional results in Table 1. Initial
extended abstract version accepted at Edge Intelligence Workshop 2022 for
poster presentation.
GHN-QAT: Training Graph Hypernetworks to Predict Quantization-Robust Parameters of Unseen Limited Precision Neural Networks
Graph Hypernetworks (GHN) can predict the parameters of varying unseen CNN
architectures with surprisingly good accuracy at a fraction of the cost of
iterative optimization. Following these successes, preliminary research has
explored the use of GHNs to predict quantization-robust parameters for 8-bit
and 4-bit quantized CNNs. However, this early work leveraged full-precision
float32 training and only quantized for testing. We explore the impact of
quantization-aware training and/or other quantization-based training strategies
on quantized robustness and performance of GHN predicted parameters for
low-precision CNNs. We show that quantization-aware training can significantly
improve quantized accuracy for GHN predicted parameters of 4-bit quantized CNNs
and even lead to greater-than-random accuracy for 2-bit quantized CNNs. These
promising results open the door for future explorations such as investigating
the use of GHN predicted parameters as initialization for further quantized
training of individual CNNs, further exploration of "extreme bitwidth"
quantization, and mixed precision quantization schemes.
Comment: Poster and extended abstract to be presented at the Workshop for Low
Bit Quantized Neural Networks (LQBNN) @ ICCV 202
Where Should We Begin? A Low-Level Exploration of Weight Initialization Impact on Quantized Behaviour of Deep Neural Networks
With the proliferation of deep convolutional neural network (CNN) algorithms for mobile processing, limited precision quantization has become an essential tool for CNN efficiency. Consequently, various works have sought to design fixed precision quantization algorithms and quantization-focused optimization techniques that minimize quantization-induced performance degradation. However, there is little concrete understanding of how various CNN design decisions/best practices affect quantized inference behaviour. Weight initialization strategies are often associated with solving issues such as vanishing/exploding gradients, but an often-overlooked aspect is their impact on the final trained distributions of each layer. We present an in-depth, fine-grained ablation study of the effect of different weight initializations on the final distributions of weights and activations of different CNN architectures. The fine-grained, layerwise analysis enables us to gain deep insights into how initial weight distributions affect final accuracy and quantized behaviour. To the best of our knowledge, we are the first to perform such a low-level, in-depth quantitative analysis of weight initialization and its effect on quantized behaviour.
Comparing Propensity Score Methods in Balancing Covariates and Recovering Impact in Small Sample Educational Program Evaluations
Propensity score applications are often used to evaluate educational program impact. However, various options are available both to estimate propensity scores and to construct comparison groups. This study used a student achievement dataset with commonly available covariates to compare different propensity scoring estimation methods (logistic regression, boosted regression, and Bayesian logistic regression) in combination with different methods for constructing comparison groups (nearest-neighbor matching, optimal matching, weighting) relative to balancing pre-existing differences and recovering a simulated treatment effect in small samples. Results indicated that applied researchers evaluating program impact should first consider use of standard logistic regression methods with nearest-neighbor or optimal matching or boosted regression in combination with propensity score weighting. Advantages and disadvantages of the methods are discussed.
An Analysis Framework for the Quantization-Aware Design of Efficient, Low-Power Convolutional Neural Networks
Deep convolutional neural network (CNN) algorithms have emerged as a powerful tool for many computer vision tasks such as image classification, object detection, and semantic segmentation. However, these algorithms are computationally expensive and difficult to adapt for resource-constrained environments. With the proliferation of CNNs for mobile, there is a growing need for methods to reduce their latency and power consumption. Furthermore, we would like a principled approach to the design and understanding of CNN model behaviour. Computationally efficient CNN architecture design and running inference with limited precision arithmetic (commonly referred to as neural network quantization) have become ubiquitous techniques for speeding up CNN inference and reducing power consumption. This work describes a method for analyzing the quantized behaviour of efficient CNN architectures and subsequently leveraging those insights for quantization-aware design of CNN models.
We introduce a framework for fine-grained, layerwise analysis of CNN models during and after training. We present an in-depth, fine-grained ablation approach to understanding the effect of different design choices on the layerwise distributions of weights and activations of CNNs. This layerwise analysis enables us to gain deep insights into how the interaction of training data, hyperparameters, and CNN architecture can ultimately affect quantized behaviour. Additionally, analysis of these distributions can yield further insights into how information propagates through the system. Various works have sought to design fixed precision quantization algorithms and optimization techniques that minimize quantization-induced performance degradation. However, to the best of our knowledge, there have not been any prior works focusing on a fine-grained analysis of why a given CNN's quantization behaviour is observed.
We demonstrate the use of this framework in two contexts of quantization-aware model design. The first is a novel ablation study investigating the impact of random weight initialization on final trained distributions of different CNN architectures and resulting quantized accuracy. Next, we combine our analysis framework with a novel "progressive depth factorization" strategy for an iterative, systematic exploration of efficient CNN architectures under quantization constraints. We algorithmically increase the granularity of depth factorization in a progressive manner while observing the resulting change in layer-wise distributions. Thus, progressive depth factorization yields in-depth, layer-level insights into efficiency-accuracy tradeoffs. Coupling fine-grained analysis with progressive depth factorization frames our design in the context of quantized behaviour. Thus, it enables efficient identification of the optimal depth-factorized macroarchitecture design based on the desired efficiency-accuracy requirements under quantization.
Hot Jupiter Magnetospheres
(Abridged) The upper atmospheres of close-in gas giant exoplanets are
subjected to intense heating/tidal forces from their parent stars.
Atomic/ionized hydrogen (H) layers are sufficiently rarefied that magnetic
pressure may dominate gas pressure for expected planetary magnetic field
strength. We examine the magnetospheric structure using a 3D isothermal
magnetohydrodynamic model that includes: a static "dead zone" near the magnetic
equator containing magnetically confined gas; a "wind zone" outside the
magnetic equator in which thermal pressure gradients and the
magneto-centrifugal-tidal effect give rise to transonic outflow; and a region
near the poles where sufficiently strong tidal forces may suppress transonic
outflow. Using dipole field geometry, we estimate the size of the dead zone to
be ~1-10 planetary radii for a range of parameters. To understand appropriate
base conditions for the 3D isothermal model, we compute a 1D thermal model in
which photoelectric heating from the stellar Lyman continuum is balanced by
collisionally-excited Lyman α cooling. This 1D model exhibits an H layer
with temperatures T=5000-10000K down to pressures of 10-100 nbar. Using the 3D
isothermal model, we compute H column densities and Lyman α transmission
spectra for parameters appropriate to HD 209458b. Line-integrated transit
depths of 5-10% can be achieved for the above base conditions. Strong magnetic
fields increase the transit signal while decreasing the mass loss, due to
higher covering fraction and density of the dead zone. In our model, most of
the transit signal arises from magnetically confined gas, some of which may be
outside the L1 equipotential. Hence the presence of gas outside the L1
equipotential does not directly imply mass loss. Lastly, we discuss the domain
of applicability for the magnetic wind model described in this paper and in the
Roche-lobe overflow model.
Comment: 26 pages, 17 figures (5 color), 2 appendices; submitted to ApJ;
higher resolution version available at
http://www.astro.virginia.edu/~gbt8f/HotJupMag_fullres_astroph.pd
Reproductive factors associated with mammographic density: a Korean co-twin control study
To determine the mechanism by which menstrual and reproductive factors are associated with the risk of breast cancer, we examined the relationships between mammographic density and known menstrual and reproductive risk factors for breast cancer. A co-twin control study was conducted with 122 pairs of monozygotic Korean female twins selected from the Healthy Twin study. Mammographic density was measured from digital mammograms using a computer-assisted method. Information on selected menstrual and reproductive factors was collected through a self-administered questionnaire. Within-pair differences for each mammographic measure were regressed against within-pair differences for each menstrual and reproductive risk factor with an adjustment for body mass index and other menstrual and reproductive factors. The percent dense area was inversely associated with the age at the first full-term childbirth (FFTB) and the number of live births, although the associations were marginally significant with an adjustment for BMI and other reproductive factors. The non-dense area was positively associated with the age at the FFTB and the number of live births. The absolute dense area was positively associated with the duration of breast feeding. The age at menarche was not associated with any component of the mammographic measures. This finding suggests that mammographic density can mediate the protective effect of greater parity against breast cancer, at least in part, while age at menarche, age at the FFTB, and breast feeding do not exert their effects through mammographic density.
Water dispersible microbicidal cellulose acetate phthalate film
BACKGROUND: Cellulose acetate phthalate (CAP) has been used for several decades in the pharmaceutical industry for enteric film coating of oral tablets and capsules. Micronized CAP, available commercially as "Aquateric" and containing additional ingredients required for micronization, used for tablet coating from water dispersions, was shown to adsorb and inactivate the human immunodeficiency virus (HIV-1), herpesviruses (HSV) and other sexually transmitted disease (STD) pathogens. Earlier studies indicate that a gel formulation of micronized CAP has potential as a topical microbicide for prevention of STDs including the acquired immunodeficiency syndrome (AIDS). The objective of endeavors described here was to develop a water dispersible CAP film amenable to inexpensive industrial mass production. METHODS: CAP and hydroxypropyl cellulose (HPC) were dissolved in different organic solvent mixtures, poured into dishes, and the solvents evaporated. Graded quantities of a resulting selected film were mixed for 5 min at 37°C with HIV-1, HSV and other STD pathogens, respectively. Residual infectivity of the treated viruses and bacteria was determined. RESULTS: The prerequisites for producing CAP films which are soft, flexible and dispersible in water, resulting in smooth gels, are combining CAP with HPC (other cellulose derivatives are unsuitable), and casting from organic solvent mixtures containing ≈50 to ≈65% ethanol (EtOH). The films are ≈100 µm thick and have a textured surface with alternating protrusions and depressions revealed by scanning electron microscopy. The films, before complete conversion into a gel, rapidly inactivated HIV-1 and HSV and reduced the infectivity of non-viral STD pathogens >1,000-fold. CONCLUSIONS: Soft pliable CAP-HPC composite films can be generated by casting from organic solvent mixtures containing EtOH. The films rapidly reduce the infectivity of several STD pathogens, including HIV-1. They are converted into gels and thus do not have to be removed following application and use. In addition to their potential as topical microbicides, the films have promise for mucosal delivery of pharmaceuticals other than CAP.
Iron bioavailability in two commercial cultivars of wheat: a comparison between wholegrain and white flour and the effects of nicotianamine and 2'-deoxymugineic acid on iron uptake into Caco-2 cells
Iron bioavailability in unleavened white and wholegrain bread made from two commercial wheat varieties was assessed by measuring ferritin production in Caco-2 cells. The breads were subjected to simulated gastrointestinal digestion and the digests applied to the Caco-2 cells. Although Riband grain contained a lower iron concentration than Rialto, iron bioavailability was higher. No iron was taken up by the cells from white bread made from Rialto flour or from wholegrain bread from either variety, but Riband white bread produced a small ferritin response. The results probably relate to differences in phytate content of the breads, although iron in soluble monoferric phytate was demonstrated to be bioavailable in the cell model. Nicotianamine, an iron chelator in plants involved in iron transport, was a more potent enhancer of iron uptake into Caco-2 cells than ascorbic acid or 2'-deoxymugineic acid, another metal chelator present in plants.